Search | BVS Regional Portal

1.

PAC Reinforcement Learning Algorithm for General-Sum Markov Games.

Zehfroosh, Ashkan; Tanner, Herbert G.

IEEE Trans Automat Contr ; 68(5): 2821-2831, 2023 May.

Article in English | MEDLINE | ID: mdl-37915545

ABSTRACT

This paper presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. Using the idea of delayed Q-learning, the paper extends the well-known Nash Q-learning algorithm to build a new PAC MARL algorithm for general-sum Markov games. In addition to guiding the design of a provably PAC MARL algorithm, the framework enables checking whether an arbitrary MARL algorithm is PAC. Comparative numerical results demonstrate the algorithm's performance and robustness.

2.

Non-Smooth Control Barrier Navigation Functions for STL Motion Planning.

Zehfroosh, Ashkan; Tanner, Herbert G.

Front Robot AI ; 9: 782783, 2022.

Article in English | MEDLINE | ID: mdl-35494541

ABSTRACT

This paper reports on a new approach to Signal Temporal Logic (STL) control synthesis that 1) utilizes a navigation function as the basis to construct a Control Barrier Function (CBF), and 2) composes navigation function-based barrier functions using nonsmooth mappings to encode Boolean operations between the predicates that those barrier functions encode. Because of these two key features, the reported approach 1) covers a larger fragment of STL compared to existing approaches, 2) alleviates the computational cost associated with evaluation of the control law for the system in existing STL control barrier function methodologies, and 3) simultaneously relaxes some of the conservativeness of smooth combinations of barrier functions as a means of implementing Boolean operators. The paper demonstrates the efficacy of this new approach with three simulation case studies: the first illustrates how a complex STL motion planning specification can be realized, the second highlights that the approach is less conservative than existing methods, and the third shows how this technology can push the envelope in the context of human-robot social interaction.
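The contrast between nonsmooth and smooth Boolean compositions can be illustrated with a small sketch. It assumes the common convention that a predicate holds where its barrier function is nonnegative, and that conjunction and disjunction are encoded by pointwise min and max; the abstract does not spell out the exact mappings, so these choices, the log-sum-exp smooth variant, and all function names are illustrative assumptions rather than the paper's construction.

```python
import math

def conj(*h):
    """Nonsmooth conjunction: nonnegative only if every h_i is nonnegative."""
    return min(h)

def disj(*h):
    """Nonsmooth disjunction: nonnegative if at least one h_i is nonnegative."""
    return max(h)

def smooth_conj(*h, k=1.0):
    """Smooth under-approximation of min via log-sum-exp (assumed for comparison).

    It never exceeds min(h), so it can reject points that satisfy all predicates.
    """
    return -math.log(sum(math.exp(-k * v) for v in h)) / k

# Two predicates that are both (barely) satisfied at some state.
h1, h2 = 0.1, 0.1
exact = conj(h1, h2)           # nonnegative: the conjunction holds
approx = smooth_conj(h1, h2)   # negative: the smooth version reports a violation
```

In this example the smooth composition is negative even though both predicates hold, which is the kind of conservativeness that a nonsmooth min avoids.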

3.

A Hybrid PAC Reinforcement Learning Algorithm for Human-Robot Interaction.

Zehfroosh, Ashkan; Tanner, Herbert G.

Front Robot AI ; 9: 797213, 2022.

Article in English | MEDLINE | ID: mdl-35391942

ABSTRACT

This paper offers a new hybrid probably approximately correct (PAC) reinforcement learning (RL) algorithm for Markov decision processes (MDPs) that intelligently maintains favorable features of both model-based and model-free methodologies. The designed algorithm, referred to as the Dyna-Delayed Q-learning (DDQ) algorithm, combines model-free Delayed Q-learning and model-based R-max algorithms while outperforming both in most cases. The paper includes a PAC analysis of the DDQ algorithm and a derivation of its sample complexity. Numerical results are provided to support the claim regarding the new algorithm's sample efficiency compared to its parents as well as the best known PAC model-free and model-based algorithms in application. A real-world experimental implementation of DDQ in the context of pediatric motor rehabilitation facilitated by infant-robot interaction highlights the potential benefits of the reported method.
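For readers unfamiliar with the model-free parent, the following is a simplified sketch of a Delayed Q-learning style update, not the DDQ algorithm itself. It assumes the usual formulation in which Q-values start optimistic and a state-action pair is updated only after `m` samples have been accumulated; the bookkeeping that the full algorithm uses to stop attempting updates is omitted, and the class name, parameter names, and default values are hypothetical.

```python
from collections import defaultdict

class DelayedQ:
    """Simplified Delayed Q-learning style learner (illustrative only)."""

    def __init__(self, actions, gamma=0.9, m=5, eps1=0.01, r_max=1.0):
        self.actions = list(actions)
        self.gamma, self.m, self.eps1 = gamma, m, eps1
        v_max = r_max / (1.0 - gamma)          # optimistic initial value
        self.q = defaultdict(lambda: v_max)    # Q(s, a)
        self.acc = defaultdict(float)          # accumulated update targets
        self.n = defaultdict(int)              # samples since last attempt

    def value(self, s):
        return max(self.q[(s, a)] for a in self.actions)

    def greedy(self, s):
        return max(self.actions, key=lambda a: self.q[(s, a)])

    def observe(self, s, a, r, s_next):
        """Record one transition; return True if Q(s, a) was updated."""
        key = (s, a)
        self.acc[key] += r + self.gamma * self.value(s_next)
        self.n[key] += 1
        if self.n[key] < self.m:
            return False                       # delay: not enough samples yet
        target = self.acc[key] / self.m
        updated = False
        # Update only if the estimate would drop by a meaningful amount.
        if self.q[key] - target >= 2 * self.eps1:
            self.q[key] = target + self.eps1
            updated = True
        self.acc[key], self.n[key] = 0.0, 0    # restart the batch
        return updated
```

A model-based method such as R-max would instead estimate transition and reward models from counts and plan on them; how DDQ combines the two is described in the paper, not here.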

4.

GEARing smart environments for pediatric motor rehabilitation.

Kokkoni, Elena; Mavroudi, Effrosyni; Zehfroosh, Ashkan; Galloway, James C; Vidal, Renè; Heinz, Jeffrey; Tanner, Herbert G.

J Neuroeng Rehabil ; 17(1): 16, 2020 Feb 10.

Article in English | MEDLINE | ID: mdl-32041623

ABSTRACT

BACKGROUND: There is a lack of early (infant) mobility rehabilitation approaches that incorporate natural and complex environments and have the potential to concurrently advance motor, cognitive, and social development. The Grounded Early Adaptive Rehabilitation (GEAR) system is a pediatric learning environment designed to provide motor interventions that are grounded in social theory and can be applied in early life. Within a perceptively complex and behaviorally natural setting, GEAR utilizes novel body-weight support technology and socially-assistive robots to both ease and encourage mobility in young children through play-based, child-robot interaction. This methodology article reports on the development and integration of the different system components and presents preliminary evidence on the feasibility of the system. METHODS: GEAR consists of physical and cyber components. The physical component includes the playground equipment to enrich the environment, an open-area body weight support (BWS) device to assist children by partially counter-acting gravity, two mobile robots to engage children in motor activity through social interaction, and a synchronized camera network to monitor the sessions. The cyber component consists of the interface to collect human movement and video data, the algorithms to identify the children's actions from the video stream, and the behavioral models for the child-robot interaction that suggest the most appropriate robot action in support of given motor training goals for the child. The feasibility of both components was assessed via preliminary testing. Three very young children (with and without Down syndrome) used the system in eight sessions within a 4-week period. RESULTS: All subjects completed the 8-session protocol, participated in all tasks involving the selected objects of the enriched environment, used the BWS device and interacted with the robots in all eight sessions. Action classification algorithms to identify early child behaviors in a complex naturalistic setting were tested and validated using the video data. Decision making algorithms specific to the type of interactions seen in the GEAR system were developed to be used for robot automation. CONCLUSIONS: Preliminary results from this study support the feasibility of both the physical and cyber components of the GEAR system and demonstrate its potential for use in future studies to assess the effects on the co-development of the motor, cognitive, and social systems of very young children with mobility challenges.

Subjects

Interpersonal Relations; Mobility Limitation; Motor Activity; Orthotic Devices; Robotics/methods; Algorithms; Child, Preschool; Developmental Disabilities/rehabilitation; Down Syndrome/rehabilitation; Female; Humans; Infant; Male

5.

Statistical Relational Learning With Unconventional String Models.

Vu, Mai H; Zehfroosh, Ashkan; Strother-Garcia, Kristina; Sebok, Michael; Heinz, Jeffrey; Tanner, Herbert G.

Front Robot AI ; 5: 76, 2018.

Article in English | MEDLINE | ID: mdl-33500955

ABSTRACT

This paper shows how methods from statistical relational learning can be used to address problems in grammatical inference using model-theoretic representations of strings. These model-theoretic representations are the basis of representing formal languages logically. Conventional representations include a binary relation for order and unary relations describing mutually exclusive properties of each position in the string. This paper presents experiments on the learning of formal languages, and their stochastic counterparts, with unconventional models, which relax the mutual exclusivity condition. Unconventional models are motivated by domain-specific knowledge. Comparison of conventional and unconventional word models shows that in the domains of phonology and robotic planning and control, Markov Logic Networks with unconventional models achieve better performance and shorter runtime with smaller networks than Markov Logic Networks with conventional models.
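The difference between the two kinds of string models can be made concrete with a small sketch. It assumes the order relation is the successor relation (the abstract only says "a binary relation for order") and uses a hypothetical feature table; the function names and the dictionary layout are illustrative and not taken from the paper.

```python
def conventional_model(word):
    """One unary relation per symbol; each position belongs to exactly one."""
    unary = {}
    for i, sym in enumerate(word):
        unary.setdefault(sym, set()).add(i)
    return {
        "domain": set(range(len(word))),
        "successor": {(i, i + 1) for i in range(len(word) - 1)},
        "unary": unary,
    }

def unconventional_model(word, features):
    """One unary relation per feature; a position may belong to several.

    `features` maps each symbol to a set of property names (hypothetical).
    """
    unary = {}
    for i, sym in enumerate(word):
        for feat in features[sym]:
            unary.setdefault(feat, set()).add(i)
    return {
        "domain": set(range(len(word))),
        "successor": {(i, i + 1) for i in range(len(word) - 1)},
        "unary": unary,
    }

# Hypothetical phonological features for two segments.
FEATURES = {"m": {"nasal", "voiced"}, "b": {"voiced"}}
conv = conventional_model("mb")
unconv = unconventional_model("mb", FEATURES)
```

In the unconventional model, position 0 satisfies both `nasal` and `voiced`, so the unary relations are no longer mutually exclusive and shared properties of different symbols become visible to the learner.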

6.

Learning models of Human-Robot Interaction from small data.

Zehfroosh, Ashkan; Kokkoni, Elena; Tanner, Herbert G; Heinz, Jeffrey.

Mediterr Conf Control Automation ; 2017: 223-228, 2017 Jul.

Article in English | MEDLINE | ID: mdl-29492408

ABSTRACT

This paper offers a new approach to learning discrete models for human-robot interaction (HRI) from small data. In the motivating application, HRI is an integral part of a pediatric rehabilitation paradigm that involves a play-based, social environment aiming at improving mobility for infants with mobility impairments. Designing interfaces in this setting is challenging, because in order to harness, and eventually automate, the social interaction between children and robots, a behavioral model capturing the causality between robot actions and child reactions is needed. The paper adopts a Markov decision process (MDP) as such a model, and selects the transition probabilities through an empirical approximation procedure called smoothing. Smoothing has been successfully applied in natural language processing (NLP) and identification where, similarly to the current paradigm, learning from small data sets is crucial. The goal of this paper is two-fold: (i) to describe our application of HRI, and (ii) to provide evidence that supports the application of smoothing for small data sets.
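As a rough illustration of why smoothing helps with small data, the sketch below estimates MDP transition probabilities with additive (Laplace) smoothing, one of the simplest schemes used in NLP. The abstract does not say which smoothing scheme the paper uses, so the choice of additive smoothing, the function name, the `alpha` parameter, and the toy state and action labels are all assumptions.

```python
from collections import Counter

def smoothed_transitions(observations, states, alpha=1.0):
    """Return a function prob(s, a, s_next) estimating P(s_next | s, a).

    observations: iterable of (state, action, next_state) tuples.
    alpha: pseudo-count added to every possible next state.
    """
    observations = list(observations)
    counts = Counter(observations)
    totals = Counter((s, a) for s, a, _ in observations)
    n_states = len(states)

    def prob(s, a, s_next):
        return (counts[(s, a, s_next)] + alpha) / (totals[(s, a)] + alpha * n_states)

    return prob

# Toy data: three observed child reactions to one robot action.
states = ["near", "far"]
data = [("far", "approach", "near"),
        ("far", "approach", "near"),
        ("far", "approach", "far")]
p = smoothed_transitions(data, states)
```

With this toy data the raw frequencies would give 2/3 and 1/3, while the smoothed estimates are 0.6 and 0.4, and a state-action pair that was never observed falls back to a uniform distribution instead of an undefined estimate.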
